Results 1 - 20 of 45,597
1.
Brief Bioinform ; 25(3)2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38622356

ABSTRACT

Identifying disease-associated microRNAs (miRNAs) could help understand the deep mechanism of diseases, which promotes the development of new medicine. Recently, network-based approaches have been widely proposed for inferring the potential associations between miRNAs and diseases. However, these approaches ignore the importance of different relations in meta-paths when learning the embeddings of miRNAs and diseases. Besides, they pay little attention to screening out reliable negative samples which is crucial for improving the prediction accuracy. In this study, we propose a novel approach named MGCNSS with the multi-layer graph convolution and high-quality negative sample selection strategy. Specifically, MGCNSS first constructs a comprehensive heterogeneous network by integrating miRNA and disease similarity networks coupled with their known association relationships. Then, we employ the multi-layer graph convolution to automatically capture the meta-path relations with different lengths in the heterogeneous network and learn the discriminative representations of miRNAs and diseases. After that, MGCNSS establishes a highly reliable negative sample set from the unlabeled sample set with the negative distance-based sample selection strategy. Finally, we train MGCNSS under an unsupervised learning manner and predict the potential associations between miRNAs and diseases. The experimental results fully demonstrate that MGCNSS outperforms all baseline methods on both balanced and imbalanced datasets. More importantly, we conduct case studies on colon neoplasms and esophageal neoplasms, further confirming the ability of MGCNSS to detect potential candidate miRNAs. The source code is publicly available on GitHub https://github.com/15136943622/MGCNSS/tree/master.


Subjects
Colonic Neoplasms , MicroRNAs , Humans , MicroRNAs/genetics , Algorithms , Computational Biology/methods , Software , Colonic Neoplasms/genetics
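The MGCNSS abstract above does not spell out how its distance-based negative sample selection works. As a minimal sketch of one plausible reading, assume each miRNA-disease pair already has an embedding vector, and treat the unlabeled pairs that lie farthest from the centroid of the known positive pairs as the most reliable negatives. The function names, the pair identifiers, and the centroid criterion are all assumptions made for illustration, not the authors' method.

```python
import math

def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def select_negatives(positives, unlabeled, k):
    """Return the k unlabeled pair ids farthest from the positive centroid.

    positives and unlabeled map a (miRNA, disease) id to its embedding.
    The centroid-distance criterion is an assumption, not the paper's rule.
    """
    center = centroid(list(positives.values()))
    ranked = sorted(
        unlabeled,
        key=lambda pair: math.dist(unlabeled[pair], center),
        reverse=True,
    )
    return ranked[:k]

# Hypothetical two-dimensional embeddings for illustration only.
positives = {("m1", "d1"): [1.0, 1.0], ("m2", "d1"): [1.0, 3.0]}
unlabeled = {
    ("m3", "d1"): [1.0, 2.5],
    ("m4", "d1"): [6.0, 2.0],
    ("m5", "d1"): [-3.0, 2.0],
}
negatives = select_negatives(positives, unlabeled, k=2)
print(negatives)
```

Pairs close to the positive centroid are left unlabeled rather than forced into the negative set, which is the intuition behind screening for reliable negatives.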
2.
Genome Biol ; 25(1): 97, 2024 Apr 15.
Article in English | MEDLINE | ID: mdl-38622738

ABSTRACT

BACKGROUND: As most viruses remain uncultivated, metagenomics is currently the main method for virus discovery. Detecting viruses in metagenomic data is not trivial. In the past few years, many bioinformatic virus identification tools have been developed for this task, making it challenging to choose the right tools, parameters, and cutoffs. As all these tools measure different biological signals, and use different algorithms and training and reference databases, it is imperative to conduct an independent benchmarking to give users objective guidance. RESULTS: We compare the performance of nine state-of-the-art virus identification tools in thirteen modes on eight paired viral and microbial datasets from three distinct biomes, including a new complex dataset from Antarctic coastal waters. The tools have highly variable true positive rates (0-97%) and false positive rates (0-30%). PPR-Meta best distinguishes viral from microbial contigs, followed by DeepVirFinder, VirSorter2, and VIBRANT. Different tools identify different subsets of the benchmarking data and all tools, except for Sourmash, find unique viral contigs. Performance of tools improved with adjusted parameter cutoffs, indicating that adjustment of parameter cutoffs before usage should be considered. CONCLUSIONS: Together, our independent benchmarking facilitates selecting choices of bioinformatic virus identification tools and gives suggestions for parameter adjustments to viromics researchers.


Subjects
Benchmarking , Viruses , Metagenome , Ecosystem , Metagenomics/methods , Computational Biology/methods , Databases, Genetic , Viruses/genetics
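The true positive and false positive rates reported in the benchmarking abstract above can be computed from contig labels as in the following sketch. It assumes each contig is known to be either viral or microbial, as in the paired datasets described; the function name and the contig identifiers are hypothetical.

```python
def detection_rates(predicted_viral, true_viral, all_contigs):
    """Return (true positive rate, false positive rate) for one tool.

    Every contig in all_contigs that is not in true_viral is treated
    as microbial.
    """
    predicted = set(predicted_viral)
    viral = set(true_viral)
    microbial = set(all_contigs) - viral
    true_positives = len(predicted & viral)
    false_positives = len(predicted & microbial)
    tpr = true_positives / len(viral) if viral else 0.0
    fpr = false_positives / len(microbial) if microbial else 0.0
    return tpr, fpr

# Hypothetical toy data: 4 viral and 6 microbial contigs.
all_contigs = [f"c{i}" for i in range(1, 11)]
true_viral = ["c1", "c2", "c3", "c4"]
predicted_viral = ["c1", "c2", "c3", "c7"]
tpr, fpr = detection_rates(predicted_viral, true_viral, all_contigs)
print(tpr, fpr)
```

Recomputing both rates while varying a tool's score cutoff is one way to explore the parameter adjustments the abstract recommends.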
3.
Skin Res Technol ; 30(4): e13624, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38558219

ABSTRACT

Chronic urticaria (CU) is characterized by persistent skin hives, redness, and itching, enhanced by immune dysregulation and inflammation. Our main objective was to identify key genes and molecular mechanisms of chronic urticaria using bioinformatics. We retrieved two datasets, GSE57178 and GSE72540, from the Gene Expression Omnibus (GEO) database. The raw data were extracted, pre-processed, and analyzed using the GEO2R tool to identify the differentially expressed genes (DEGs). The samples were divided into two groups: healthy samples and CU samples. We defined cut-off values of log2 fold change ≥1 and p < .05. Analyses were performed using the Kyoto Encyclopaedia of Genes and Genomes (KEGG), the Database for Annotation, Visualization and Integrated Discovery (DAVID), Metascape, the Search Tool for the Retrieval of Interacting Genes/Proteins (STRING), and CIBERSORT. We obtained 1613 differentially expressed genes. There were 114 overlapping genes in both datasets, of which 102 genes were up-regulated and 12 were down-regulated. The biological processes included activation of myeloid leukocytes, response to inflammation, and response to organic substances. Moreover, the KEGG pathways of CU were enriched in the Nuclear Factor-Kappa B (NF-κB) signaling pathway, the Tumor Necrosis Factor (TNF) signaling pathway, and the Janus kinase/signal transducers and activators of transcription (JAK-STAT) signaling pathway. We identified 27 hub genes that were implicated in the pathogenesis of CU, such as interleukin-6 (IL-6), prostaglandin-endoperoxide synthase 2 (PTGS2), and intercellular adhesion molecule-1 (ICAM1). The complex interplay between immune responses, inflammatory pathways, cytokine networks, and specific genes enhances CU. Understanding these mechanisms paves the way for potential interventions to mitigate symptoms and improve the quality of life of CU patients.


Subjects
Chronic Urticaria , Gene Expression Profiling , Humans , Gene Expression Profiling/methods , Quality of Life , Inflammation , Computational Biology/methods
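The cut-offs stated in the chronic urticaria abstract above (log2 fold change of at least 1 and p below .05), together with the overlap between two datasets, can be illustrated with a short sketch. It assumes the fold-change cut-off applies to the absolute value, so that down-regulated genes are captured as well; the gene names and values are invented placeholders, not data from GSE57178 or GSE72540.

```python
def select_degs(rows, min_log2fc=1.0, max_p=0.05):
    """Split (gene, log2fc, p_value) rows into up- and down-regulated sets.

    Assumes the fold-change cut-off is applied to |log2fc|.
    """
    up, down = set(), set()
    for gene, log2fc, p_value in rows:
        if p_value >= max_p or abs(log2fc) < min_log2fc:
            continue  # fails the significance or effect-size cut-off
        (up if log2fc > 0 else down).add(gene)
    return up, down

# Hypothetical rows standing in for two GEO2R result tables.
dataset_a = [
    ("GENE_A", 2.3, 0.001),
    ("GENE_B", -1.4, 0.010),
    ("GENE_C", 0.4, 0.002),   # effect too small
    ("GENE_D", 1.8, 0.200),   # not significant
]
dataset_b = [
    ("GENE_A", 1.1, 0.030),
    ("GENE_B", -2.0, 0.004),
    ("GENE_C", 1.5, 0.010),
    ("GENE_E", 3.0, 0.001),
]

up_a, down_a = select_degs(dataset_a)
up_b, down_b = select_degs(dataset_b)
shared_up = up_a & up_b        # up-regulated in both datasets
shared_down = down_a & down_b  # down-regulated in both datasets
print(sorted(shared_up), sorted(shared_down))
```

Intersecting the up and down sets separately keeps a gene that changes direction between datasets out of the overlap.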
4.
Sci Rep ; 14(1): 7672, 2024 04 01.
Article in English | MEDLINE | ID: mdl-38561377

ABSTRACT

Lipopolysaccharide (LPS) is known to elicit a robust immune response. This study aimed to investigate the impact of LPS on the transcriptome of human nasal epithelial cells (HNEpC). HNEpC were cultured and stimulated with LPS (1 µg/mL) or an equivalent amount of normal culture medium. Subsequently, total RNA was extracted, purified, and sequenced using next-generation RNA sequencing technology. Differentially expressed genes (DEGs) were identified and subjected to functional enrichment analysis. A protein-protein interaction (PPI) network of DEGs was constructed, followed by Ingenuity Pathway Analysis (IPA) to identify molecular pathways influenced by LPS exposure on HNEpC. Validation of key genes was performed using quantitative real-time PCR (qRT-PCR). A total of 97 DEGs, comprising 48 up-regulated genes and 49 down-regulated genes, were identified. Results from functional enrichment analysis, PPI, and IPA indicated that DEGs were predominantly enriched in chemokine-related signaling pathways. Subsequent qRT-PCR validation demonstrated significant upregulation of key genes in these pathways in LPS-treated HNEpC compared to control cells. In conclusion, LPS intervention profoundly altered the transcriptome of HNEpC, potentially exacerbating inflammatory responses through the activation of chemokine-related signaling pathways.


Subjects
Gene Expression Profiling , Lipopolysaccharides , Humans , Gene Expression Profiling/methods , Lipopolysaccharides/pharmacology , Transcriptome , Signal Transduction/genetics , Epithelial Cells , Chemokines/genetics , Computational Biology/methods
5.
Sci Rep ; 14(1): 8136, 2024 04 07.
Article in English | MEDLINE | ID: mdl-38584172

ABSTRACT

Computational approaches for predicting the pathogenicity of genetic variants have advanced in recent years. These methods enable researchers to determine the possible clinical impact of rare and novel variants. Historically these prediction methods used hand-crafted features based on structural, evolutionary, or physiochemical properties of the variant. In this study we propose a novel framework that leverages the power of pre-trained protein language models to predict variant pathogenicity. We show that our approach VariPred (Variant impact Predictor) outperforms current state-of-the-art methods by using an end-to-end model that only requires the protein sequence as input. Using one of the best-performing protein language models (ESM-1b), we establish a robust classifier that requires no calculation of structural features or multiple sequence alignments. We compare the performance of VariPred with other representative models including 3Cnet, Polyphen-2, REVEL, MetaLR, FATHMM and ESM variant. VariPred performs as well as, or in most cases better than these other predictors using six variant impact prediction benchmarks despite requiring only sequence data and no pre-processing of the data.


Subjects
Mutation, Missense , Proteins , Virulence , Proteins/genetics , Amino Acid Sequence , Computational Biology/methods
6.
Arq Bras Cardiol ; 121(2): e20230462, 2024.
Article in Portuguese, English | MEDLINE | ID: mdl-38597542

ABSTRACT

BACKGROUND: ST-segment elevation myocardial infarction (STEMI) is one of the leading causes of fatal cardiovascular diseases, which have been the prime cause of mortality worldwide. Diagnosis in the early phase would benefit clinical intervention and prognosis, but the exploration of the biomarkers of STEMI is still lacking. OBJECTIVES: In this study, we conducted a bioinformatics analysis to identify potential crucial biomarkers in the progression of STEMI. METHODS: We obtained GSE59867 for STEMI and stable coronary artery disease (SCAD) patients. Differentially expressed genes (DEGs) were screened with the threshold of |log2 fold change| > 0.5 and p < 0.05. Based on these genes, we conducted enrichment analysis to explore the potential relevance between genes and to screen hub genes. Subsequently, hub genes were analyzed to detect related miRNAs, and DAVID was used to detect transcription factors for further analysis. Finally, GSE62646 was utilized to assess the specificity of the DEGs; genes with AUC values exceeding 75% were considered potential candidate biomarkers. RESULTS: A total of 133 DEGs between SCAD and STEMI were obtained. Then, the PPI network of DEGs was constructed using STRING and Cytoscape, and further analysis determined hub genes and 6 molecular complexes. Functional enrichment analysis of the DEGs suggests that pathways related to inflammation, metabolism, and immunity play a pivotal role in the progression from SCAD to STEMI. In addition, related miRNAs were predicted; hsa-miR-124, hsa-miR-130a/b, and hsa-miR-301a/b regulated the expression of the largest number of genes. Meanwhile, transcription factor analysis indicates that EVI1, AML1, GATA1, and PPARG are the most enriched transcription factors. Finally, ROC curves demonstrate that MS4A3, KLRC4, KLRD1, AQP9, and CD14 exhibit both high sensitivity and specificity in predicting STEMI. CONCLUSIONS: This study revealed that immunity, metabolism, and inflammation are involved in the development of STEMI derived from SCAD, and 6 genes, including MS4A3, KLRC4, KLRD1, AQP9, CD14, and CCR1, could be employed as candidate biomarkers for STEMI.


Subjects
Coronary Artery Disease , MicroRNAs , ST Elevation Myocardial Infarction , Humans , ST Elevation Myocardial Infarction/diagnosis , ST Elevation Myocardial Infarction/genetics , Gene Expression Profiling/methods , Biomarkers, Tumor/genetics , Biomarkers, Tumor/metabolism , Biomarkers , MicroRNAs/genetics , Transcription Factors/genetics , Computational Biology/methods , Inflammation
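The STEMI abstract above keeps genes whose AUC exceeds 75% in a validation dataset. As a minimal sketch of that screening step, the AUC of a single gene can be computed as the fraction of (STEMI, SCAD) sample pairs in which the STEMI sample has the higher expression, with ties counted as one half. The expression values below are invented for illustration, and the abstract does not state which software the authors used for the ROC analysis.

```python
def auc(case_values, control_values):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen case scores higher
    than a randomly chosen control (ties count as 0.5).
    """
    wins = 0.0
    for case in case_values:
        for control in control_values:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_values) * len(control_values))

# Hypothetical expression values for one gene.
stemi_expression = [2.1, 1.8, 1.5, 0.9]
scad_expression = [1.0, 0.8, 1.6, 0.5]

gene_auc = auc(stemi_expression, scad_expression)
is_candidate = gene_auc > 0.75  # threshold taken from the abstract
print(gene_auc, is_candidate)
```

For a gene that is lower in STEMI than in SCAD, the same function returns a value below 0.5, so a real screen would also need to consider 1 minus the AUC.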
7.
BMC Bioinformatics ; 25(1): 157, 2024 Apr 20.
Article in English | MEDLINE | ID: mdl-38643108

ABSTRACT

BACKGROUND: The identification of essential proteins can help in understanding the minimum requirements for cell survival and development to discover drug targets and prevent disease. Nowadays, node ranking methods are a common way to identify essential proteins, but the poor data quality of the underlying PIN has somewhat hindered the identification accuracy of essential proteins for these methods in the PIN. Therefore, researchers constructed refinement networks by considering certain biological properties of interacting protein pairs to improve the performance of node ranking methods in the PIN. Studies show that proteins in a complex are more likely to be essential than proteins not present in the complex. However, the modularity is usually ignored for the refinement methods of the PINs. METHODS: Based on this, we proposed a network refinement method based on module discovery and biological information. The idea is, first, to extract the maximal connected subgraph in the PIN, and to divide it into different modules by using Fast-unfolding algorithm; then, to detect critical modules according to the orthologous information, subcellular localization information and topology information within each module; finally, to construct a more refined network (CM-PIN) by using the identified critical modules. RESULTS: To evaluate the effectiveness of the proposed method, we used 12 typical node ranking methods (LAC, DC, DMNC, NC, TP, LID, CC, BC, PR, LR, PeC, WDC) to compare the overall performance of the CM-PIN with those on the S-PIN, D-PIN and RD-PIN. The experimental results showed that the CM-PIN was optimal in terms of the identification number of essential proteins, precision-recall curve, Jackknifing method and other criteria, and can help to identify essential proteins more accurately.


Subjects
Saccharomyces cerevisiae Proteins , Saccharomyces cerevisiae , Saccharomyces cerevisiae/metabolism , Saccharomyces cerevisiae Proteins/metabolism , Protein Interaction Mapping/methods , Algorithms , Protein Interaction Maps , Computational Biology/methods
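Two of the steps named in the essential-protein abstract above, extracting the maximal connected subgraph of the PIN and ranking nodes by degree centrality (DC), can be sketched with the standard library alone. The module detection with the Fast-unfolding algorithm and the scoring of critical modules are not reproduced here; the protein identifiers and edges are hypothetical.

```python
from collections import defaultdict, deque

def build_graph(edges):
    """Undirected adjacency sets from a list of (protein, protein) pairs."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    return graph

def largest_component(graph):
    """Node set of the largest connected component, found by BFS."""
    seen, best = set(), set()
    for start in graph:
        if start in seen:
            continue
        component, queue = {start}, deque([start])
        while queue:
            node = queue.popleft()
            for neighbor in graph[node]:
                if neighbor not in component:
                    component.add(neighbor)
                    queue.append(neighbor)
        seen |= component
        if len(component) > len(best):
            best = component
    return best

def rank_by_degree(graph, nodes):
    """Sort nodes by degree within the node set, highest first."""
    return sorted(nodes, key=lambda n: (-len(graph[n] & nodes), n))

# Hypothetical interaction network with two components.
edges = [("P1", "P2"), ("P1", "P3"), ("P1", "P4"), ("P2", "P3"), ("P5", "P6")]
graph = build_graph(edges)
component = largest_component(graph)
ranking = rank_by_degree(graph, component)
print(ranking)
```

In a refinement workflow like the one described, the same ranking function could be run on the original and the refined network to compare how many known essential proteins appear near the top.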
8.
Medicine (Baltimore) ; 103(16): e37616, 2024 Apr 19.
Article in English | MEDLINE | ID: mdl-38640260

ABSTRACT

Atherosclerosis is a chronic, progressive vascular disease. The relationship between CASP1 gene expression and atherosclerosis remains unclear. The atherosclerosis datasets GSE132651 and GSE202625 were downloaded from the Gene Expression Omnibus. Differentially expressed genes (DEGs) were screened. Construction and analysis of a protein-protein interaction network, functional enrichment analysis, gene set enrichment analysis, and Comparative Toxicogenomics Database (CTD) analysis were performed. A gene expression heatmap was drawn. TargetScan was used to screen miRNAs that regulate the central DEGs. In total, 47 DEGs were identified. According to Gene Ontology analysis, they were mainly enriched in the regulation of stimulus response, response to organic matter, extracellular region, and identical protein binding. Kyoto Encyclopedia of Genes and Genomes analysis showed that the target cells were mainly enriched in the PI3K-Akt signaling pathway, Ras signaling pathway, and PPAR signaling pathway. In the Metascape enrichment analysis, the Gene Ontology terms included vascular development, regulation of body fluid levels, and positive regulation of cell motility. Eleven core genes (CASP1, NLRP3, MRC1, IRS1, PPARG, APOE, IL13, FGF2, CCR2, ICAM1, HIF1A) were obtained. IRS1, PPARG, APOE, FGF2, CCR2, and HIF1A genes are identified as core genes. The gene expression heatmap showed that CASP1 was highly expressed in atherosclerosis samples and lowly expressed in normal samples. NLRP3, MRC1, IRS1, PPARG, APOE, IL13, FGF2, CCR2, ICAM1, and HIF1A were lowly expressed in atherosclerosis samples. CTD analysis showed that 5 genes (CASP1, NLRP3, CCR2, ICAM1, HIF1A) were associated with pneumonia, inflammation, cardiac enlargement, and tumor invasiveness. The CASP1 gene is highly expressed in atherosclerosis. The higher the CASP1 gene expression, the worse the prognosis.


Subjects
Atherosclerosis , Gene Expression Profiling , Humans , Gene Expression Profiling/methods , NLR Family, Pyrin Domain-Containing 3 Protein , Fibroblast Growth Factor 2 , Interleukin-13 , PPAR gamma , Phosphatidylinositol 3-Kinases , Gene Expression Regulation, Neoplastic , Atherosclerosis/genetics , Apolipoproteins E , Computational Biology/methods , Gene Regulatory Networks
9.
Neurosci Lett ; 828: 137764, 2024 Apr 01.
Article in English | MEDLINE | ID: mdl-38582325

ABSTRACT

BACKGROUND: Ataxia Telangiectasia (AT) is a genetic disorder characterized by compromised DNA repair, cerebellar degeneration, and immune dysfunction. Understanding the molecular mechanisms driving AT pathology is crucial for developing targeted therapies. METHODS: In this study, we conducted a comprehensive analysis to elucidate the molecular mechanisms underlying AT pathology. Using publicly available RNA-seq datasets comparing control and AT samples, we employed in silico transcriptomics to identify potential genes and pathways. We performed differential gene expression analysis with DESeq2 to reveal dysregulated genes associated with AT. Additionally, we constructed a Protein-Protein Interaction (PPI) network to explore the interactions between proteins implicated in AT. RESULTS: The network analysis identified hub genes, including TYROBP and PCP2, crucial in immune regulation and cerebellar function, respectively. Furthermore, pathway enrichment analysis unveiled dysregulated pathways linked to AT pathology, providing insights into disease progression. CONCLUSION: Our integrated approach offers a holistic understanding of the complex molecular landscape of AT and identifies potential targets for therapeutic intervention. By combining transcriptomic analysis with network-based methods, we provide valuable insights into the underlying mechanisms of AT pathogenesis.


Subjects
Ataxia Telangiectasia , Cerebellar Diseases , Humans , Neuroinflammatory Diseases , Protein Interaction Maps , Gene Expression Profiling/methods , Computational Biology/methods
10.
BMC Genomics ; 25(1): 367, 2024 Apr 15.
Article in English | MEDLINE | ID: mdl-38622534

ABSTRACT

The tissue damage caused by transient ischemic injury is an essential component of the pathogenesis of retinal ischemia, which mainly hinges on the degree and duration of interruption of the blood supply and the subsequent damage caused by tissue reperfusion. Some research indicated that the retinal injury induced by ischemia-reperfusion (I/R) was related to reperfusion time. In this study, we screened the differentially expressed circRNAs, lncRNAs, and mRNAs between the control and model groups and at different reperfusion times (24h, 72h, and 7d) with the aid of whole transcriptome sequencing technology, and the trend changes in time-varying mRNA, lncRNA, and circRNA were obtained by chronological analysis. Then, candidate circRNAs, lncRNAs, and mRNAs were obtained as the intersection of differentially expressed genes and trend change genes. Key genes whose expression changed with increasing reperfusion time were selected by gene importance scores. Also, the characteristic differentially expressed genes specific to each reperfusion time were analyzed, and key genes specific to reperfusion time were selected to show the change in biological process with increasing reperfusion time. As a result, 316 candidate mRNAs, 137 candidate lncRNAs, and 31 candidate circRNAs were obtained by the intersection of differentially expressed mRNAs, lncRNAs, and circRNAs with trend mRNAs, trend lncRNAs, and trend circRNAs, and 5 key genes (Cd74, RT1-Da, RT1-CE5, RT1-Bb, RT1-DOa) were selected by gene importance scores. The result of GSEA showed that key genes were found to play vital roles in antigen processing and presentation, regulation of the actin cytoskeleton, and the ribosome. Two networks were constructed: one including 4 key genes (Cd74, RT1-Da, RT1-Bb, RT1-DOa), 34 miRNAs, 48 lncRNAs, and 81 regulatory relationship axes, and one including 4 key genes (Cd74, RT1-Da, RT1-Bb, RT1-DOa), 9 miRNAs, 3 circRNAs (circRNA_10572, circRNA_03219, circRNA_11359), and 12 regulatory relationship axes. The subcellular location, transcription factors, signaling network, targeted drugs, and relationship to eye diseases of the key genes were predicted. In total, 1370 characteristic differentially expressed mRNAs (spec_24h mRNA), 558 characteristic differentially expressed mRNAs (spec_72h mRNA), and 92 characteristic differentially expressed mRNAs (spec_7d mRNA) were found, and their key genes and regulation networks were analyzed. In summary, we screened the differentially expressed circRNAs, lncRNAs, and mRNAs between the control and model groups and at different reperfusion times (24h, 72h, and 7d). Five key genes, Cd74, RT1-Da, RT1-CE5, RT1-Bb, and RT1-DOa, were selected. Key genes specific to reperfusion time were selected to show the change in biological process with increasing reperfusion time. These results provided theoretical support and a reference basis for clinical treatment.


Subjects
MicroRNAs , RNA, Long Noncoding , Reperfusion Injury , Rats , Animals , RNA, Circular/genetics , RNA, Long Noncoding/genetics , MicroRNAs/genetics , RNA, Messenger/genetics , RNA, Messenger/metabolism , Transcriptome , Reperfusion Injury/genetics , Computational Biology/methods , Ischemia , Gene Regulatory Networks
11.
BMC Musculoskelet Disord ; 25(1): 291, 2024 Apr 15.
Article in English | MEDLINE | ID: mdl-38622662

ABSTRACT

OBJECTIVES: The aim of this study was to explore the long non-coding RNA (lncRNA) expression profiles in serum of patients with ankylosing spondylitis (AS). The role of these lncRNAs in this complex autoimmune situation needs to be evaluated. METHODS: We used high-throughput whole-transcriptome sequencing to generate sequencing data from three patients with AS and three normal controls (NC). Then, we performed bioinformatics analyses to identify the functional and biological processes associated with differentially expressed lncRNAs (DElncRNAs). We confirmed the validity of our RNA-seq data by assessing the expression of eight lncRNAs via quantitative reverse transcription polymerase chain reaction (qRT-PCR) in 20 AS and 20 NC samples. We measured the correlation between the expression levels of lncRNAs and patient clinical index values using the Spearman correlation test. RESULTS: We identified 72 significantly upregulated and 73 significantly downregulated lncRNAs in AS patients compared to NC. qRT-PCR was performed to validate the expression of selected DElncRNAs; the results demonstrated that the expression levels of MALAT1:24, NBR2:9, lnc-DLK1-35:13, lnc-LARP1-1:1, lnc-AIPL1-1:7, and lnc-SLC12A7-1:16 were consistent with the sequencing analysis results. Enrichment analysis showed that DElncRNAs mainly participated in immune and inflammatory response pathways, such as regulation of protein ubiquitination, major histocompatibility complex class I-mediated antigen processing and presentation, MAP kinase activation, and interleukin-17 signaling pathways. In addition, a competing endogenous RNA network was constructed to determine the interaction among the lncRNAs, microRNAs, and mRNAs based on the confirmed lncRNAs (MALAT1:24 and NBR2:9). We further found the expression of MALAT1:24 and NBR2:9 to be positively correlated with disease severity. CONCLUSION: Taken together, our study presents a comprehensive overview of lncRNAs in the serum of AS patients, thereby contributing novel perspectives on the underlying pathogenic mechanisms of this condition. In addition, our study predicted that MALAT1 has the potential to be deeply involved in the pathogenesis of AS.


Subjects
MicroRNAs , RNA, Long Noncoding , Spondylitis, Ankylosing , Humans , RNA, Long Noncoding/genetics , Gene Expression Profiling/methods , Spondylitis, Ankylosing/genetics , MicroRNAs/metabolism , Computational Biology/methods , Gene Regulatory Networks , Adaptor Proteins, Signal Transducing/genetics
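The ankylosing spondylitis abstract above uses the Spearman correlation test to relate lncRNA expression to clinical index values. The coefficient itself is the Pearson correlation of the ranks, as in the following sketch; it computes only the coefficient, not the p-value, and the expression and clinical values are invented for illustration.

```python
def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

def spearman(x, y):
    """Spearman rank correlation coefficient."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical lncRNA expression and clinical index values for 5 patients.
expression = [1.2, 2.5, 3.1, 4.8, 5.0]
clinical_index = [10, 20, 15, 40, 50]
rho = spearman(expression, clinical_index)
print(round(rho, 3))
```

Because it works on ranks, the coefficient is insensitive to the scale of either variable, which suits clinical indices that are not normally distributed.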
12.
PLoS One ; 19(4): e0287864, 2024.
Article in English | MEDLINE | ID: mdl-38626166

ABSTRACT

Cervical cancer is the fourth most frequent type of cancer in women and the leading cause of mortality for females worldwide. Traditionally, medicinal plants have been utilized to treat various illnesses and ailments. The current study uses molecular docking to investigate the possible anticancer effects of Juglans regia phytoconstituents on cervical cancer target proteins. This work analyzes the microarray dataset GSE63678 from the NCBI Gene Expression Omnibus database to find differentially expressed genes. Furthermore, protein-protein interactions of differentially expressed genes were constructed using network biology techniques. The top five hub genes (IGF1, FGF2, ESR1, MYL9, and MYH11) were then determined by computing topological parameters with CytoHubba. In addition, molecular docking was performed with Juglans regia phytocompounds extracted from the IMPPAT database against the identified hub genes. Molecular dynamics simulation confirmed that the prioritized docked complexes with low binding energies were stable.


Subjects
Juglans , Uterine Cervical Neoplasms , Humans , Female , Molecular Docking Simulation , Juglans/genetics , Juglans/chemistry , Uterine Cervical Neoplasms/drug therapy , Uterine Cervical Neoplasms/genetics , Microarray Analysis , Computational Biology/methods
13.
Brief Bioinform ; 25(3)2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38581416

ABSTRACT

The inference of gene regulatory networks (GRNs) from gene expression profiles has been a key issue in systems biology, prompting many researchers to develop diverse computational methods. However, most of these methods do not reconstruct directed GRNs with regulatory types because of the lack of benchmark datasets or defects in the computational methods. Here, we collect benchmark datasets and propose a deep learning-based model, DeepFGRN, for reconstructing fine gene regulatory networks (FGRNs) with both regulation types and directions. In addition, the GRNs of real species are always large graphs with direction and high sparsity, which impede the advancement of GRN inference. Therefore, DeepFGRN builds a node bidirectional representation module to capture the directed graph embedding representation of the GRN. Specifically, the source and target generators are designed to learn the low-dimensional dense embedding of the source and target neighbors of a gene, respectively. An adversarial learning strategy is applied to iteratively learn the real neighbors of each gene. In addition, because the expression profiles of genes with regulatory associations are correlative, a correlation analysis module is designed. Specifically, this module not only fully extracts gene expression features, but also captures the correlation between regulators and target genes. Experimental results show that DeepFGRN has a competitive capability for both GRN and FGRN inference. Potential biomarkers and therapeutic drugs for breast cancer, liver cancer, lung cancer and coronavirus disease 2019 are identified based on the candidate FGRNs, providing a possible opportunity to advance our knowledge of disease treatments.


Subjects
Gene Regulatory Networks , Liver Neoplasms , Humans , Systems Biology/methods , Transcriptome , Algorithms , Computational Biology/methods
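The DeepFGRN abstract above distinguishes a gene's source neighbors from its target neighbors in a directed network with regulation types. A minimal data-structure sketch of that idea follows, assuming that source neighbors are the regulators with an edge into a gene and target neighbors are the genes it regulates, and that regulation types are either activation or repression. This only illustrates the network representation; it is not the model, and the gene names are hypothetical.

```python
from collections import defaultdict

REGULATION_TYPES = ("activation", "repression")  # assumed label set

def index_neighbors(edges):
    """Index directed, typed edges by target and by regulator.

    edges is a list of (regulator, target, regulation_type) triples.
    Returns (regulators_of, targets_of), each a dict of dicts mapping
    a neighbor to the regulation type of the connecting edge.
    """
    regulators_of = defaultdict(dict)
    targets_of = defaultdict(dict)
    for regulator, target, kind in edges:
        if kind not in REGULATION_TYPES:
            raise ValueError(f"unknown regulation type: {kind}")
        targets_of[regulator][target] = kind
        regulators_of[target][regulator] = kind
    return regulators_of, targets_of

# Hypothetical fine gene regulatory network.
edges = [
    ("TF1", "G1", "activation"),
    ("TF1", "G2", "repression"),
    ("TF2", "G1", "repression"),
    ("G1", "TF2", "activation"),
]
regulators_of, targets_of = index_neighbors(edges)
print(dict(regulators_of["G1"]))
print(dict(targets_of["TF1"]))
```

Keeping the two indexes separate preserves direction: TF1 regulates G1 here, but G1 does not regulate TF1.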
14.
Brief Bioinform ; 25(3)2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38581418

ABSTRACT

Following the milestone success of the Human Genome Project, the 'Encyclopedia of DNA Elements (ENCODE)' initiative was launched in 2003 to unearth information about the numerous functional elements within the genome. This endeavor coincided with the emergence of numerous novel technologies, accompanied by the provision of vast amounts of whole-genome sequences, high-throughput data such as ChIP-Seq and RNA-Seq. Extracting biologically meaningful information from this massive dataset has become a critical aspect of many recent studies, particularly in annotating and predicting the functions of unknown genes. The core idea behind genome annotation is to identify genes and various functional elements within the genome sequence and infer their biological functions. Traditional wet-lab experimental methods still rely on extensive efforts for functional verification. However, early bioinformatics algorithms and software primarily employed shallow learning techniques; thus, the ability to characterize data and features learning was limited. With the widespread adoption of RNA-Seq technology, scientists from the biological community began to harness the potential of machine learning and deep learning approaches for gene structure prediction and functional annotation. In this context, we reviewed both conventional methods and contemporary deep learning frameworks, and highlighted novel perspectives on the challenges arising during annotation underscoring the dynamic nature of this evolving scientific landscape.


Subjects
Deep Learning , Humans , Genome , Algorithms , Software , Computational Biology/methods , Molecular Sequence Annotation
15.
Cancer Rep (Hoboken) ; 7(4): e2032, 2024 Apr.
Artigo em Inglês | MEDLINE | ID: mdl-38577722

RESUMO

BACKGROUND: The diverse and complex attributes of cancer make it a daunting global challenge, and it continues to endanger human life. Detection of critical cancer-related gene alterations in solid tumor samples better defines patient diagnosis and prognosis, and indicates which targeted therapies must be administered to improve cancer patients' outcomes. MATERIALS AND METHODS: To identify genes that have aberrant expression across different cancer types, differentially expressed genes (DEGs) were detected within the TCGA datasets. Subsequently, the DEGs common to all cancer types were determined. Furthermore, various methods were employed to analyze genetic alterations, the gene co-expression network, the protein-protein interaction (PPI) network, and pathway enrichment of the common genes. Finally, the gene regulatory network was constructed. RESULTS: Intersectional analysis identified UBE2C as a common DEG across all 28 cancer types studied. Upregulated UBE2C expression was significantly correlated with overall survival (OS) and disease-free survival (DFS) in 10 and 9 cancer types, respectively. Also, UBE2C can be a diagnostic factor in CESC, CHOL, GBM, and UCS with AUC = 100% and can diagnose 19 cancer types with AUC ≥90%. A ceRNA network was constructed that included UBE2C, 41 TFs, 10 shared miRNAs, 21 circRNAs, and 128 lncRNAs. CONCLUSION: In summary, UBE2C can be a theranostic gene, which may serve as a reliable biomarker for diagnosing cancers, improving treatment responses and increasing the overall survival of cancer patients, and can be a promising gene to be targeted by cancer drugs in the future.
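The intersectional analysis described above, finding DEGs shared by every cancer type, reduces to a set intersection once each cohort has its own DEG list. The sketch below is illustrative only: the three cohort labels stand in for the 28 studied, every gene symbol other than UBE2C is a placeholder, and `common_degs` is a hypothetical helper rather than the authors' pipeline.

```python
# Toy DEG sets per cancer type. Only UBE2C comes from the abstract;
# the other gene symbols are placeholders invented for this example.
DEGS_BY_CANCER = {
    "CESC": {"UBE2C", "GENE_1", "GENE_2"},
    "CHOL": {"UBE2C", "GENE_2", "GENE_3"},
    "GBM": {"UBE2C", "GENE_1", "GENE_4"},
}


def common_degs(degs_by_cancer):
    """Return the genes that are differentially expressed in every cohort."""
    gene_sets = list(degs_by_cancer.values())
    if not gene_sets:
        return set()
    return set.intersection(*gene_sets)


print(common_degs(DEGS_BY_CANCER))
```

Because the intersection can only shrink as cohorts are added, a gene that survives all of them is a strict pan-cancer candidate; a looser criterion (for example, present in most cohorts) would need a count per gene instead.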


Subjects
Biomarkers , Neoplasms , Ubiquitin-Conjugating Enzymes , Humans , Biomarkers/metabolism , Computational Biology/methods , Neoplasms/diagnosis , Neoplasms/genetics , Prognosis , Protein Interaction Maps/genetics , Ubiquitin-Conjugating Enzymes/genetics , Ubiquitin-Conjugating Enzymes/metabolism
16.
BMC Bioinformatics ; 25(1): 145, 2024 Apr 05.
Article in English | MEDLINE | ID: mdl-38580921

ABSTRACT

BACKGROUND: Drug targets in living beings play pivotal roles in the discovery of potential drugs. Conventional wet-lab characterization of drug targets, although accurate, is generally expensive, slow, and resource-intensive. Therefore, computational methods are highly desirable as an alternative to expedite the large-scale identification of druggable proteins (DPs); however, the performance of existing in silico predictors is still not satisfactory. METHODS: In this study, we developed a novel deep learning-based model, DPI_CDF, for predicting DPs based on protein sequence only. DPI_CDF utilizes evolutionary-based (i.e., histograms of oriented gradients for position-specific scoring matrix), physicochemical-based (i.e., component protein sequence representation), and compositional-based (i.e., normalized qualitative characteristic) properties of the protein sequence to generate features. Then a hierarchical deep forest model fuses these three encoding schemes to build the proposed model DPI_CDF. RESULTS: The empirical outcomes of 10-fold cross-validation demonstrate that the proposed model achieved 99.13% accuracy and a Matthews correlation coefficient (MCC) of 0.982 on the training dataset. The generalization power of the trained model was further examined on an independent dataset, where it achieved a maximum accuracy of 95.01% and an MCC of 0.900. When compared to current state-of-the-art methods, DPI_CDF improves accuracy by 4.27% and 4.31% on the training and testing datasets, respectively. We believe DPI_CDF will help the research community identify druggable proteins and accelerate the drug discovery process. AVAILABILITY: The benchmark datasets and source codes are available on GitHub: http://github.com/Muhammad-Arif-NUST/DPI_CDF.
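For readers unfamiliar with the two metrics reported above, the following sketch computes accuracy and the Matthews correlation coefficient from the four cells of a binary confusion matrix. The counts are toy values chosen for illustration, not results from the paper, and the function names are ours.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient for a binary confusion matrix.

    Returns 0.0 when any marginal is empty (denominator is zero),
    which is a common convention rather than part of the definition.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Toy counts (hypothetical): 45 true positives, 40 true negatives,
# 10 false positives, 5 false negatives.
print(accuracy(45, 40, 10, 5))
print(round(mcc(45, 40, 10, 5), 4))
```

MCC ranges from -1 to 1 and uses all four cells, so it stays informative on imbalanced datasets where accuracy alone can look high for a trivial classifier.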


Subjects
Proteins , Software , Amino Acid Sequence , Position-Specific Scoring Matrices , Biological Evolution , Computational Biology/methods
17.
Virol J ; 21(1): 67, 2024 Mar 20.
Article in English | MEDLINE | ID: mdl-38509569

ABSTRACT

Since 1997, highly pathogenic avian influenza viruses, such as H5N1, have been recognized as a possible pandemic hazard to humans and the poultry industry. The rapid mutation rate of H5N1 viruses makes the whole process of designing vaccines extremely challenging. Here, we used an in silico approach to design a multi-epitope vaccine against H5N1 influenza A virus using hemagglutinin (HA) and neuraminidase (NA) antigens. B-cell, cytotoxic T lymphocyte (CTL) and helper T lymphocyte (HTL) epitopes were predicted via IEDB, NetMHC-4 and NetMHCII-2.3, respectively. Two adjuvants, human β-defensin-3 (HβD-3) and the pan HLA DR-binding epitope (PADRE), were chosen to induce a stronger immune response. Linkers including KK, AAY, HEYGAEALERAG, GPGPGPG and double EAAAK were utilized to link epitopes and adjuvants. This construct encodes a protein of 350 amino acids with a molecular weight of 38.46 kDa. An antigenicity of ~1, non-allergenicity, non-toxicity and appropriate solubility were confirmed through Vaxigen, AllerTOP, ToxDL and DeepSoluE, respectively. The 3D structure of H5N1 was refined and validated with a Z-score of -0.87 and an overall Ramachandran of 99.7%. Docking analysis showed H5N1 could interact with TLR7 (docking score of -374.08 and 4 hydrogen bonds) and TLR8 (docking score of -414.39 and 3 hydrogen bonds). Molecular dynamics simulation results showed RMSD and RMSF of 0.25 nm and 0.2 for the H5N1-TLR7 complex and RMSD and RMSF of 0.45 nm and 0.4 for the H5N1-TLR8 complex, respectively. Molecular Mechanics Poisson-Boltzmann Surface Area (MM/PBSA) analysis confirmed the stability and continuity of the interaction, with a total binding energy of -29.97 kJ/mol for H5N1-TLR7 and -23.9 kJ/mol for H5N1-TLR8. Immune response simulation predicted that the construct can stimulate T and B cells of the immune system, which shows the merit of this proposed H5N1 vaccine candidate for clinical trials.


Subjects
Influenza A Virus, H5N1 Subtype , Vaccines , Animals , Humans , Influenza A Virus, H5N1 Subtype/genetics , Epitopes, T-Lymphocyte/genetics , Toll-Like Receptor 7 , Toll-Like Receptor 8 , Epitopes, B-Lymphocyte , Computational Biology/methods , Molecular Docking Simulation , Vaccines, Subunit/genetics
18.
Comput Biol Med ; 171: 108206, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38430745

ABSTRACT

INTRODUCTION: The rapid growth of omics technologies has led to the use of bioinformatics as a powerful tool for unravelling scientific puzzles. However, the obstacles of bioinformatics are compounded by the complexity of data processing and the distinct nature of omics data types, particularly in terms of visualization and statistics. OBJECTIVES: We developed a comprehensive and free platform, CFViSA, to facilitate effortless visualization and statistical analysis of omics data by the scientific community. METHODS: CFViSA was constructed using the Scala programming language and utilizes the AKKA toolkit for the web server and MySQL for the database server. The visualization and statistical analysis were performed with the R program. RESULTS: CFViSA integrates two omics data analysis pipelines (microbiome and transcriptome analysis) and an extensive array of 79 analysis tools spanning simple sequence processing, visualization, and statistics for various omics data, including microbiome and transcriptome data. CFViSA starts from an analysis interface that is accompanied by a full demonstration course to help users understand the operating principles and set the analysis parameters on a sound scientific basis. Once an analysis has been run, users can enter the task history interface to adjust figures and then obtain a complete series of results, including statistics, feature tables and figures. All graphic layouts are printed with the necessary statistics and a traceback function that records the options used for analysis and visualization; these statistics are not provided by the five competing methods. CONCLUSION: CFViSA is a user-friendly bioinformatics cloud platform with detailed guidelines that integrates functions for multi-omics analysis with real-time visualization adjustment and provision of a complete series of results. CFViSA is available at http://www.cloud.biomicroclass.com/en/CFViSA/.


Subjects
Computational Biology , Gene Expression Profiling , Computational Biology/methods , Gene Expression Profiling/methods , Databases, Factual , Transcriptome , Software
19.
Bioinformatics ; 40(3)2024 Mar 04.
Article in English | MEDLINE | ID: mdl-38449296

ABSTRACT

MOTIVATION: The functional complexity of biochemical processes is strongly related to the interplay of proteins and their assembly into protein complexes. In recent years, the discovery and characterization of protein complexes have substantially progressed through advances in cryo-electron microscopy, proteomics, and computational structure prediction. This development results in a strong need for computational approaches to analyse the data of large protein complexes for structural and functional characterization. Here, we aim to provide a suitable approach, which processes the growing number of large protein complexes, to obtain biologically meaningful information on the hierarchical organization of the structures of protein complexes. RESULTS: We modelled the quaternary structure of protein complexes as undirected, labelled graphs called complex graphs. In complex graphs, the vertices represent protein chains and the edges represent spatial chain-chain contacts. We hypothesized that clusters based on the complex graph correspond to functional biological modules. To compute the clusters, we applied the Leiden clustering algorithm. To evaluate our approach, we chose the human respiratory complex I, which has been extensively investigated and exhibits a known, experimentally validated biological module structure. Additionally, we characterized a eukaryotic group II chaperonin TRiC/CCT and the head of the bacteriophage Φ29. The analysis of the protein complexes correlated with experimental findings and indicated known functional biological modules. Our approach makes it possible not only to predict functional biological modules in large protein complexes with characteristic features but also to investigate the flexibility of specific regions and conformational changes. The predicted modules can aid in the planning and analysis of experiments. AVAILABILITY AND IMPLEMENTATION: Jupyter notebooks to reproduce the examples are available on our public GitHub repository: https://github.com/MolBIFFM/PTGLtools/tree/main/PTGLmodulePrediction.
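To make the complex-graph idea concrete, the sketch below represents chains as vertices and chain-chain contacts as undirected edges, then scores a candidate partition with Newman modularity. This is not the authors' code and does not implement Leiden; modularity is only one of the quality functions that Leiden-style algorithms can optimize, and the abstract does not say which one was used. The chain labels, the `CONTACTS` list, and the `modularity` helper are hypothetical.

```python
# Hypothetical chain-chain contacts: two tightly packed triples of chains
# (A-B-C and D-E-F) joined by a single contact between C and D.
CONTACTS = [
    ("A", "B"), ("B", "C"), ("A", "C"),
    ("C", "D"),
    ("D", "E"), ("E", "F"), ("D", "F"),
]


def modularity(edges, clusters):
    """Newman modularity of a partition of an undirected, unweighted graph."""
    m = len(edges)
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    # Map each vertex to the index of the cluster that contains it.
    label = {node: i for i, cluster in enumerate(clusters) for node in cluster}
    intra = [0] * len(clusters)  # number of edges inside each cluster
    for u, v in edges:
        if label[u] == label[v]:
            intra[label[u]] += 1
    deg_sum = [sum(degree.get(n, 0) for n in cluster) for cluster in clusters]
    return sum(
        intra[i] / m - (deg_sum[i] / (2 * m)) ** 2 for i in range(len(clusters))
    )


two_modules = [{"A", "B", "C"}, {"D", "E", "F"}]
one_module = [{"A", "B", "C", "D", "E", "F"}]
print(modularity(CONTACTS, two_modules))  # higher: matches the contact pattern
print(modularity(CONTACTS, one_module))   # baseline: everything in one cluster
```

A clustering algorithm searches over partitions for a high-scoring one; in this toy graph the split into the two triples scores higher than lumping all chains together, which is the kind of structure the abstract hypothesizes corresponds to functional modules.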


Subjects
Computational Biology , Protein Interaction Mapping , Humans , Protein Interaction Mapping/methods , Cryoelectron Microscopy , Computational Biology/methods , Algorithms , Proteins/metabolism
20.
J Transl Med ; 22(1): 282, 2024 Mar 15.
Article in English | MEDLINE | ID: mdl-38491529

ABSTRACT

BACKGROUND: Oral inflammatory diseases are localized infectious diseases primarily caused by oral pathogens with the potential for serious systemic complications. However, publicly available datasets for these diseases are underutilized. To address this issue, a web tool called OralExplorer was developed. This tool integrates the available data and provides comprehensive online bioinformatic analysis. METHODS: Human oral inflammatory disease-related datasets were obtained from the GEO database and normalized using a standardized process. Transcriptome data were then subjected to differential gene expression analysis, immune infiltration analysis, correlation analysis, pathway enrichment analysis, and visualization. The single-cell sequencing data were visualized as cluster plots, feature plots, and heatmaps. The web platform was primarily built using Shiny. The biomarkers identified in OralExplorer were validated using local clinical samples through qPCR and IHC. RESULTS: A total of 35 human oral inflammatory disease-related datasets, covering 6 main disease types and 901 samples, were included in the study to identify potential molecular signatures of the mechanisms of oral diseases. OralExplorer consists of 5 main analysis modules (differential gene expression analysis, immune infiltration analysis, correlation analysis, pathway enrichment analysis and single-cell analysis), with multiple visualization options. The platform offers a simple and intuitive interface, high-quality images for visualization, and detailed analysis results tables for easy access by users. Six markers (IL1β, SRGN, CXCR1, FGR, ARHGEF2, and PTAFR) were identified by OralExplorer. qPCR- and IHC-based experimental validation showed significantly higher levels of these genes in the periodontitis group. CONCLUSIONS: OralExplorer is a comprehensive analytical platform for oral inflammatory diseases. It allows users to interactively explore the molecular mechanisms underlying the action and regression of these diseases. It also aids dental researchers in unlocking the potential value of transcriptomics data related to oral diseases. OralExplorer can be accessed at https://smuonco.shinyapps.io/OralExplorer/ (Alternate URL: http://robinl-lab.com/OralExplorer).


Subjects
Computational Biology , Software , Humans , Computational Biology/methods , Gene Expression Profiling/methods , Transcriptome/genetics , Databases, Factual , Rho Guanine Nucleotide Exchange Factors